
Enable "kick the tires" support for Nvidia GPUs in COS #45136

Merged 3 commits into kubernetes:master on May 23, 2017

Conversation

@vishh (Contributor) commented Apr 29, 2017

This PR provides an installation daemonset that will install Nvidia CUDA drivers on Google Container Optimized OS (COS).
User-space libraries and debug utilities from the Nvidia driver installation are made available in a special directory on the host:

  • /home/kubernetes/bin/nvidia/lib for libraries
  • /home/kubernetes/bin/nvidia/bin for debug utilities

Containers that run CUDA applications on COS are expected to consume the libraries and debug utilities (if necessary) from the host directories using HostPath volumes.
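As a rough illustration, a CUDA pod on COS could consume those host directories as in the hedged sketch below. The image name, the alpha GPU resource name, and the in-container mount paths are illustrative assumptions rather than anything defined by this PR:

kubectl create -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: cuda-example
spec:
  containers:
  - name: cuda-app
    image: gcr.io/my-project/cuda-vector-add:v0.1   # hypothetical image
    resources:
      limits:
        alpha.kubernetes.io/nvidia-gpu: 1           # alpha GPU resource name (assumption)
    env:
    - name: LD_LIBRARY_PATH
      value: /usr/local/nvidia/lib                  # matches the mountPath chosen below
    volumeMounts:
    - name: nvidia-libraries
      mountPath: /usr/local/nvidia/lib
    - name: nvidia-debug-tools
      mountPath: /usr/local/nvidia/bin
  volumes:
  - name: nvidia-libraries
    hostPath:
      path: /home/kubernetes/bin/nvidia/lib         # libraries published by the installer
  - name: nvidia-debug-tools
    hostPath:
      path: /home/kubernetes/bin/nvidia/bin         # debug utilities published by the installer
EOF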

Note: This solution requires updating Pod Spec across distros. This is a known issue and will be addressed in the future. Until then CUDA workloads will not be portable.

This PR also updates the COS base image version to m59. The image update is coupled with the GPU changes for the following reasons:

  1. Driver installation requires disabling a kernel feature in COS.
  2. The kernel API for disabling that feature changed across COS versions.
  3. If the COS image update were not handled in this PR, a subsequent COS image update would break GPU integration and would require another update to the installation scripts added here.
  4. Instead of posting three PRs (one to add the basic installer, one to update COS to m59, and one to update the installer again), this PR combines the changes to reduce review overhead and latency, as well as the noise that would be created when GPU tests break.

Try out this PR

  1. Get quota for GPUs in any region.
  2. export KUBE_GCE_ZONE=<zone-with-gpus> KUBE_NODE_OS_DISTRIBUTION=gci
  3. NODE_ACCELERATORS="type=nvidia-tesla-k80,count=1" cluster/kube-up.sh
  4. kubectl create -f cluster/gce/gci/nvidia-gpus/cos-installer-daemonset.yaml
  5. Run your CUDA app in a pod (a consolidated sketch of these steps follows below).
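The same steps, collected into one hedged shell sketch (fill in a zone where you have GPU quota):

# Hedged sketch of the try-out steps above.
export KUBE_GCE_ZONE=<zone-with-gpus>
export KUBE_NODE_OS_DISTRIBUTION=gci

# Bring up a cluster with one K80 attached to each node.
NODE_ACCELERATORS="type=nvidia-tesla-k80,count=1" cluster/kube-up.sh

# Install the Nvidia drivers on the COS nodes with the daemonset from this PR.
kubectl create -f cluster/gce/gci/nvidia-gpus/cos-installer-daemonset.yaml

# Once the installer pods complete, run your CUDA app in a pod
# (see the HostPath example in the description above).
kubectl get pods -o wide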

Another option is to run an e2e manually to try out this PR

  1. Get quota for GPUs in any region.
  2. export KUBE_GCE_ZONE=<zone-with-gpus> KUBE_NODE_OS_DISTRIBUTION=gci
  3. export NODE_ACCELERATORS="type=nvidia-tesla-k80,count=1"
  4. go run hack/e2e.go -- --up
  5. hack/ginkgo-e2e.sh --ginkgo.focus="\[Feature:GPU\]"
    The e2e will install the drivers automatically using the daemonset and then run test workloads to validate driver integration.
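To spot-check a node by hand after the installer daemonset has run (outside of the e2e), something along these lines should work; treating nvidia-smi as one of the shipped debug utilities is an assumption:

# Hedged sketch: verify the installer's output on a GPU node.
NODE=<name-of-a-gpu-node>    # e.g. pick one from `kubectl get nodes`
gcloud compute ssh "${NODE}" --zone "${KUBE_GCE_ZONE}" --command '
  ls /home/kubernetes/bin/nvidia/lib /home/kubernetes/bin/nvidia/bin &&
  sudo LD_LIBRARY_PATH=/home/kubernetes/bin/nvidia/lib \
      /home/kubernetes/bin/nvidia/bin/nvidia-smi'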

TODO:

  • Update COS image version to m59 release.
  • Remove sleep from the install script and add it to the daemonset
  • Add an e2e that will run the daemonset and run a sample CUDA app on COS clusters.
  • Set up a test project with the necessary quota to run GPU tests against HEAD, to start with; see "Adding CI e2e for testing GPU support in Kubernetes" (test-infra#2759)
  • Update node e2e serial configs to install nvidia drivers on COS by default

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Apr 29, 2017
@k8s-reviewable: This change is Reviewable

@k8s-github-robot k8s-github-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. release-note-label-needed labels Apr 29, 2017
@vishh vishh force-pushed the cos-nvidia-driver-install branch 4 times, most recently from b880ab3 to bfa122d Compare May 5, 2017 23:58
@k8s-github-robot k8s-github-robot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 8, 2017
@vishh vishh force-pushed the cos-nvidia-driver-install branch from 5eddeb4 to 2bc0629 Compare May 8, 2017 03:17
@vishh vishh changed the title WIP: Automated install of nvidia drivers in COS. Automated install of nvidia drivers in COS. May 10, 2017
@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 10, 2017
@vishh vishh force-pushed the cos-nvidia-driver-install branch from bc47260 to 86c5ae4 Compare May 10, 2017 22:47
@k8s-github-robot k8s-github-robot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 11, 2017
@vishh vishh force-pushed the cos-nvidia-driver-install branch from 795840d to 35a6810 Compare May 13, 2017 23:29
@k8s-github-robot k8s-github-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 13, 2017
@vishh (Contributor Author) commented May 13, 2017

@Amey-D This PR works now. I see the e2e gets stuck at times on a new cluster while trying to figure out whether nodes have GPUs in their capacity. I will try to get to the bottom of it soon.
Otherwise, this PR can be a good starting point. PTAL.

@mtaufen will you be able to review this patch?

@vishh vishh added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note-label-needed labels May 13, 2017
@vishh vishh added this to the v1.7 milestone May 13, 2017
@vishh vishh changed the title Automated install of nvidia drivers in COS. Enable "kick the tires" support for Nvidia GPUs in COS May 15, 2017
@vishh vishh force-pushed the cos-nvidia-driver-install branch from 4daf007 to 6c4372d Compare May 15, 2017 20:02
@@ -1593,4 +1593,5 @@ else
fi
reset-motd
prepare-mounter-rootfs
modprobe configs
Contributor

For IKCONFIG? Note it is not listed in the GKE node image spec https://docs.google.com/document/d/1qmiJOuLYqjJZF-PTfn-xvTbLvzTgFw8gMcWavX7qiQ0/edit

So we may have to double check our images to ensure they are built with this module.

Contributor Author

Good point. How are folks expected to discover the doc you posted? Can you add an entry for /proc/config.gz?

Contributor Author

Oh, Nvidia driver installation may not require configs on all distros. I'm not sure about that.

Contributor

@dchen1107 is the person to talk to about changing the node spec
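As background for this thread: modprobe configs loads the module that exposes the running kernel's configuration at /proc/config.gz (CONFIG_IKCONFIG_PROC), which the installer presumably needs in order to build the driver against the running kernel. A hedged way to check a node image for it:

# Hedged sketch: confirm the node image exposes its kernel configuration.
modprobe configs || echo "configs module not available in this image"
if [[ -r /proc/config.gz ]]; then
  zcat /proc/config.gz | grep -E 'CONFIG_IKCONFIG|CONFIG_MODULES'
else
  echo "/proc/config.gz missing; kernel may lack CONFIG_IKCONFIG_PROC"
fi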

ENV DEBIAN_FRONTEND noninteractive

RUN apt-get -qq update
RUN apt-get install -qq -y pciutils gcc g++ git make dpkg-dev bc module-init-tools curl
Contributor

I believe -qq implies -y

Contributor Author

Oh sweet.

NVIDIA_DRIVER_PKG_NAME="NVIDIA-Linux-x86_64-375.26.run"

check_nvidia_device() {
lspci
Contributor

Maybe worth grepping for NVIDIA devices here too, for less output?

Contributor Author

I can optimize it by wrapping around a command, but I actually like verbose output.

Contributor

fair enough
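For reference, the reviewer's suggestion amounts to something like:

# Hedged sketch: limit lspci output to Nvidia devices.
lspci | grep -i nvidia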


echo "Running the Nvidia driver installer ..."
if ! sh "${NVIDIA_DRIVER_PKG_NAME}" --kernel-source-path="${KERNEL_SRC_DIR}" --silent --accept-license --keep --log-file-name="${log_file_name}"; then
echo "Nvidia installer failed, log below:"
Contributor

Maybe also print where people can find the full log file

Contributor Author

All the commands run by this script will be printed (set -x), so the tail command below will display the log file name.

Contributor

ok
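For context, the failure branch presumably continues roughly as below; the exact tail invocation is not shown in the excerpt and is an assumption:

# Hedged sketch of the failure path; the tail call is assumed.
if ! sh "${NVIDIA_DRIVER_PKG_NAME}" --kernel-source-path="${KERNEL_SRC_DIR}" \
    --silent --accept-license --keep --log-file-name="${log_file_name}"; then
  echo "Nvidia installer failed, log below:"
  tail -n 50 "${log_file_name}"   # with set -x, this line also echoes the log file path
  exit 1
fi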

@@ -23,7 +23,7 @@ genrule(
name = "bindata",
srcs = [
"//examples:sources",
"//test/images:sources",
"//cluster/gce/gci/nvidia-gpus:sources//test/images:sources",
Contributor

looks like you might have accidentally ended up with two srcs in the same string here

Contributor Author

Yeah. Fixing it. Good catch.

)

func makeCudaAdditionTestPod() *v1.Pod {
podName := testPodNamePrefix + string(uuid.NewUUID())
Contributor

reason to prefer this over GenerateName?

Contributor Author

debugging....

Contributor

fair point

return testPod
}

func isClusterRunningCOS(f *framework.Framework) bool {
Contributor

Might want to skip the master here, since theoretically you could have a non-cos master with cos nodes, and the nodes are where you care about GPU (though I don't think this os image skew happens in practice today). One way to do this would be to skip unschedulable nodes.

Contributor Author

Why would unschedulable nodes have GPUs on them? Seems like an expensive unschedulable node.

Contributor

Masters are marked unschedulable, so skipping unschedulables will skip masters. All I'm saying is that this test only cares that COS is on the nodes, and doesn't have to fail if the master is non-COS.
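The same "only schedulable nodes need GPUs" check can be eyeballed from the command line; jq is required, and the alpha GPU resource name below is an assumption about this release:

# Hedged sketch: list schedulable nodes with their advertised GPU capacity.
kubectl get nodes -o json | jq -r '
  .items[]
  | select(.spec.unschedulable != true)   # skips the master and cordoned nodes
  | "\(.metadata.name)\t\(.status.capacity["alpha.kubernetes.io/nvidia-gpu"] // "0")"'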


func areGPUsAvailableOnAllSchedulableNodes(f *framework.Framework) bool {
framework.Logf("Getting list of Nodes from API server")
nodeList, err := f.ClientSet.Core().Nodes().List(metav1.ListOptions{})
Contributor

you could pass nodeList into these helper functions, instead of re-listing on every call

Contributor Author

I actually want to re-list.

Contributor

ah, ok

IMAGE = $(REGISTRY)/cuda-vector-add

build:
docker build --pull -t $(IMAGE):$(TAG) .
Contributor

consider specifying arch as well

Contributor Author

Good point. I'd like to do it later since I don't know the multi-arch story for CUDA yet.

Contributor

ok
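One hedged way to make the architecture explicit, following the arch-suffixed image naming common in this repo at the time rather than anything in this PR, would be:

# Hedged sketch: arch-suffixed image name (REGISTRY/TAG mirror the Makefile above).
REGISTRY=gcr.io/google_containers
TAG=v0.1
ARCH=amd64
docker build --pull -t "${REGISTRY}/cuda-vector-add-${ARCH}:${TAG}" .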

# limitations under the License.

TAG=v0.1
REGISTRY=gcr.io/google_containers
Contributor

consider ?= for TAG and REGISTRY

Contributor Author

done. good idea.
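With ?= in place, an exported environment value (or a make command-line override) takes precedence over the defaults, e.g.:

# Hedged usage sketch: ?= only assigns when the variable is not already set.
REGISTRY=gcr.io/my-test-project TAG=v0.2 make   # default target; actual target name may differ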

@vishh vishh force-pushed the cos-nvidia-driver-install branch from 6c4372d to 1b022fb Compare May 16, 2017 03:39
@vishh vishh (Contributor Author) left a comment

Addressed comments. PTAL.

all: container

container:
docker build --pull -t ${REGISTRY}/${IMAGE}:${TAG} .
Contributor Author

This is currently only meant for COS, which is restricted to amd64 on GCP. Do you think more code is useful?

@vishh vishh force-pushed the cos-nvidia-driver-install branch from 76c18a9 to 1968f78 Compare May 20, 2017 13:39
@vishh (Contributor Author) commented May 20, 2017

Changes to hack/* are minimal. Adding an approval label.

@vishh vishh added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 20, 2017
@vishh (Contributor Author) commented May 21, 2017

@k8s-bot unit test this

@vishh (Contributor Author) commented May 21, 2017

@madhusudancs @nikhiljindal can either of you help me understand why the federation e2es are failing for this PR?
I'm updating the master and node image project dynamically as part of cluster setup. I could not figure out whether federation has independent logic for bringing up clusters. Kubemark had an independent setup script and I fixed that.

@vishh (Contributor Author) commented May 21, 2017

@k8s-bot kops aws e2e test this

@vishh (Contributor Author) commented May 21, 2017

@k8s-bot pull-kubernetes-federation-e2e-gce test this

vishh added 2 commits May 20, 2017 21:17
…Optimized OS

Packaged the script as a docker container stored in gcr.io/google-containers
A daemonset deployment is included to make it easy to consume the installer
A cluster e2e has been added to test the installation daemonset along with verifying installation
by using a sample CUDA application.
Node e2e for GPUs updated to avoid running on nodes without GPU devices.

Signed-off-by: Vishnu kannan <vishnuk@google.com>
Signed-off-by: Vishnu kannan <vishnuk@google.com>
@vishh vishh force-pushed the cos-nvidia-driver-install branch from e106bd6 to 86b5edb Compare May 21, 2017 04:17
Signed-off-by: Vishnu kannan <vishnuk@google.com>
@madhusudancs (Contributor)

@vishh please ignore federation tests for now. We are actively debugging the problem. It is not "required" right now, so PRs can merge without it passing.

@vishh (Contributor Author) commented May 21, 2017

@madhusudancs Thanks for the tip.
@mtaufen Let's get this merged as soon as you get a chance to review.

# Otherwise, we respect whatever is set by the user.
MASTER_IMAGE=${KUBE_GCE_MASTER_IMAGE:-${GCI_VERSION}}
MASTER_IMAGE_PROJECT=${KUBE_GCE_MASTER_PROJECT:-google-containers}
DEFAULT_GCI_PROJECT=google-containers
Contributor

since you are changing the default version, why refer to google-containers project in context of gci here and elsewhere? why not just switch the projects to cos-cloud?

Contributor Author

I can change the defaults here too. Honestly I don't think it matters. All these hacks would hopefully disappear soon.
At this point, I want this PR to go in and then I can clean things up.
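For readers following along, the default-project handling under discussion presumably ends up along these lines (a hedged reconstruction built from the variables visible in the excerpt; the cos-cloud switch is exactly the part being debated):

# Hedged reconstruction of the default-project selection.
DEFAULT_GCI_PROJECT=google-containers
if [[ "${GCI_VERSION}" == "cos"* ]]; then
  DEFAULT_GCI_PROJECT=cos-cloud
fi
MASTER_IMAGE_PROJECT=${KUBE_GCE_MASTER_PROJECT:-${DEFAULT_GCI_PROJECT}}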

@mtaufen mtaufen (Contributor) left a comment

LGTM (one typo in a comment, but that's it)

# git checkout "tags/v${kernel_version_stripped}"
git checkout ${LAKITU_KERNEL_SHA1}

# Prepare kernel configu and source for modules.
Contributor

s/configu/config

MASTER_IMAGE=${KUBE_GCE_MASTER_IMAGE:-${GCI_VERSION}}
MASTER_IMAGE_PROJECT=${KUBE_GCE_MASTER_PROJECT:-google-containers}
DEFAULT_GCI_PROJECT=google-containers
if [[ "${GCI_VERSION}" == "cos"* ]]; then
Contributor

"cos"* is a neat trick, will have to add that to my toolbox.

@mtaufen (Contributor) commented May 23, 2017

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 23, 2017
@k8s-github-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mtaufen, vishh
We suggest the following additional approvers: gmarek, madhusudancs

Assign the PR to them by writing /assign @gmarek @madhusudancs in a comment when ready.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-ci-robot (Contributor) commented May 23, 2017

@vishh: The following test(s) failed:

Test name Commit Details Rerun command
Jenkins unit/integration 1968f78c425e54889e3ac502dc154f623bf04ade link @k8s-bot unit test this
Jenkins kops AWS e2e 1968f78c425e54889e3ac502dc154f623bf04ade link @k8s-bot kops aws e2e test this
pull-kubernetes-federation-e2e-gce 333e571 link @k8s-bot pull-kubernetes-federation-e2e-gce test this

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-github-robot

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit 1e21058 into kubernetes:master May 23, 2017
@@ -49,21 +49,21 @@ images:
tests:
- 'resource tracking for 105 pods per node \[Benchmark\]'
gci-resource1:
image: gci-stable-56-9000-84-2
image: cos-beta-59-9460-20-0
project: google-containers
Contributor

Projects need to be updated to cos-cloud in this config.

Contributor Author

Oops. I thought I updated it everywhere. Apologies.

}

restart_kubelet() {
echo "Sending SIGTERM to kubelet"
Member

Why do we need to restart the kubelet during the installation? This kind of dependency could be error-prone: deploying and managing this nvidia-gpus daemonset depends on the kubelet, but in the middle of running the daemonset (installer.sh), the kubelet is restarted.

Contributor Author

This is necessary for the kubelet to pick up the GPUs. Kubelet cannot support hotplugging GPUs for various reasons. We tried that with just PCI based data and it's proving to be hard.
Ideally, we need to reboot the entire node, but we haven't gotten there yet.
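For context, the restart boils down to something like the sketch below; relying on the node's init system or kubelet babysitter to bring the kubelet back up is an assumption about the environment, not something shown in the excerpt:

# Hedged sketch: stop the running kubelet and let its supervisor restart it,
# so that it re-detects the freshly installed GPU devices on startup.
restart_kubelet() {
  echo "Sending SIGTERM to kubelet"
  pkill -TERM kubelet || echo "kubelet process not found"
  # The init system / kubelet babysitter on the node is assumed to restart it.
}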

k8s-github-robot pushed a commit that referenced this pull request May 25, 2017
Automatic merge from submit-queue (batch tested with PRs 46299, 46309, 46311, 46303, 46150)

Fix cos image project to cos-cloud.

Addressed #45136 (comment).

@vishh @yujuhong @dchen1107
k8s-github-robot pushed a commit that referenced this pull request Jun 3, 2017
Automatic merge from submit-queue (batch tested with PRs 41563, 45251, 46265, 46462, 46721)

change kubemark image project to match new cos image project

The old project is not available anymore.

#45136
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.